A flexible framework for deriving assertions from electronic medical records
Abstract
OBJECTIVE: This paper describes natural-language-processing techniques for two tasks: identification of medical concepts in clinical text, and classification of assertions, which indicate the existence, absence, or uncertainty of a medical problem. Because so many resources are available for processing clinical texts, there is interest in developing a framework in which features derived from these resources can be optimally selected for the two tasks of interest.
MATERIALS AND METHODS: The authors used two machine-learning (ML) classifiers: support vector machines (SVMs) and conditional random fields (CRFs). Because SVMs and CRFs can operate on a large set of features extracted from both clinical texts and external resources, the authors address the following research question: Which features need to be selected for obtaining optimal results? To this end, the authors devise feature-selection techniques which greatly reduce the amount of manual experimentation and improve performance.
RESULTS: The authors evaluated their approaches on the 2010 i2b2/VA challenge data. Concept extraction achieves 79.59 micro F-measure. Assertion classification achieves 93.94 micro F-measure.
DISCUSSION: Approaching medical concept extraction and assertion classification through ML-based techniques has the advantage of easily adapting to new data sets and new medical informatics tasks. However, ML-based techniques perform best when optimal features are selected. By devising promising feature-selection techniques, the authors obtain results that outperform the current state of the art.
CONCLUSION: This paper presents two ML-based approaches for processing language in the clinical texts evaluated in the 2010 i2b2/VA challenge. By using novel feature-selection methods, the techniques presented in this paper are unique among the i2b2 participants.
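The abstract reports both results as micro F-measure. As a rough illustration of what that metric computes, the sketch below pools true positives, false positives, and false negatives over all annotations before taking precision and recall. It is not the official i2b2/VA scorer: the function name `micro_f_measure`, the tuple layout `(document, start, end, type)`, and the toy annotations are all assumptions made for this example, and exact-match comparison is assumed.

```python
def micro_f_measure(gold, predicted):
    """Micro-averaged precision, recall and F1 over two sets of annotations.

    Each annotation is a hashable tuple; here (doc_id, start, end, type) is
    an assumed layout. A prediction counts as correct only on exact match.
    """
    tp = len(gold & predicted)   # annotations present in both sets
    fp = len(predicted - gold)   # predicted but not in the gold standard
    fn = len(gold - predicted)   # in the gold standard but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (hypothetical, not taken from the challenge corpus).
gold = {
    ("doc1", 0, 2, "problem"),
    ("doc1", 5, 6, "test"),
    ("doc2", 3, 4, "treatment"),
    ("doc2", 8, 9, "problem"),
}
predicted = {
    ("doc1", 0, 2, "problem"),
    ("doc1", 5, 6, "test"),
    ("doc2", 3, 4, "treatment"),
    ("doc2", 7, 9, "problem"),   # wrong boundary, so it is a miss and a false alarm
}

precision, recall, f1 = micro_f_measure(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Because counts are pooled across all classes and documents, frequent classes weigh more heavily in a micro-averaged score than rare ones.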
Related articles
Model Formulation: Modeling Electronic Discharge Summaries as a Simple Temporal Constraint Satisfaction Problem
OBJECTIVE To model the temporal information contained in medical narrative reports as a simple temporal constraint satisfaction problem. DESIGN A constraint satisfaction problem is defined by time points and constraints (inequalities between points). A time interval comprises a pair of points and a constraint. Five complete electronic discharge summaries and paragraphs from 226 other discharg...
Automated identification of medical concepts and assertions in medical text.
This paper describes a machine learning, text processing approach that allows the extraction of key medical information from unstructured text in Electronic Medical Records. The approach utilizes a novel text representation that shares the simplicity of the widely used bag-of-words representation, but can also represent some form of semantic information in the text. The large dimensionality of ...
Developing a Standardized Medical Speech Recognition Database for Reconstructive Hand Surgery
Fast and holistic access to the patients’ clinical record is a major requirement of modern medical decision support systems (DSS). While electronic health records (EHRs) have replaced the traditional paper-based records in most healthcare organizations, the data entry into these systems remains largely manual. Speech recognition technology promises substitution of the more convenient speech-base...
How to Standardize Electronic Medical Records
Introduction: One of the key elements of success of health institutions is Standardization. This study introduces the methods and stages of electronic medical records standardization. Methods: The present study is a narrative review of the studies on the stages and methods of electronic medical records standardization. Results: The process of standardization of electronic medical records incl...
A Framework for Assessing Adherence and Persistence to Long-Term Medication
Poor adherence and persistence to long-term medication is a growing concern worldwide. Despite their importance, tools that facilitate the identification of patients who show poor adherence and persistence rates are limited. Herein we present a framework we have developed to assist in assessing adherence and persistence rates. We demonstrate the framework's features using production electronic ...
Journal:
- Journal of the American Medical Informatics Association : JAMIA
Volume 18, Issue 5
Pages: -
Publication date: 2011